

Data-Efficient Reinforcement Learning in Continuous State-Action Gaussian-POMDPs

Neural Information Processing Systems

We present a data-efficient reinforcement learning method for continuous state-action systems under significant observation noise. Data-efficient solutions under small noise exist, such as PILCO, which learns the cartpole swing-up task in 30s. PILCO evaluates policies by planning state trajectories using a dynamics model. However, PILCO applies policies to the observed state, and therefore plans in observation space. We extend PILCO with filtering to instead plan in belief space, consistent with partially observable Markov decision process (POMDP) planning. This enables data-efficient learning under significant observation noise, outperforming more naive methods such as post-hoc application of a filter to policies optimised by the original (unfiltered) PILCO algorithm. We test our method on the cartpole swing-up task, which involves nonlinear dynamics and requires nonlinear control.
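The core idea, acting on a filtered belief over the latent state rather than on the raw noisy observation, can be illustrated with a toy sketch. The following is a minimal illustrative example only (a scalar linear-Gaussian system with a hand-picked linear policy and made-up parameters), not the authors' Gaussian-process-based method, which plans full belief-space trajectories:

```python
import numpy as np

def kalman_update(mu, var, y, obs_var):
    """Condition the Gaussian belief N(mu, var) on a noisy observation y = x + noise."""
    k = var / (var + obs_var)            # Kalman gain
    return mu + k * (y - mu), (1.0 - k) * var

def simulate(steps=200, obs_var=1.0, filtered=True, seed=0):
    """Average squared-state cost of a linear policy, acting on either the
    filtered belief mean or the raw observation. All parameters are illustrative."""
    rng = np.random.default_rng(seed)
    a, b, proc_var = 1.0, 0.5, 0.01      # dynamics: x' = a*x + b*u + process noise
    gain = -1.6                          # simple stabilising linear policy
    x, mu, var = 1.0, 1.0, 1.0           # true state and initial Gaussian belief
    cost = 0.0
    for _ in range(steps):
        y = x + rng.normal(0.0, np.sqrt(obs_var))   # noisy observation
        if filtered:
            mu, var = kalman_update(mu, var, y, obs_var)
            u = gain * mu                            # act on the belief mean
        else:
            u = gain * y                             # act on the raw observation
        cost += x * x
        x = a * x + b * u + rng.normal(0.0, np.sqrt(proc_var))
        if filtered:
            mu = a * mu + b * u                      # belief predict step
            var = a * a * var + proc_var
    return cost / steps
```

Under significant observation noise, the unfiltered policy feeds the observation noise directly back into the control signal, while the filtered policy suppresses most of it, so the filtered controller incurs a noticeably lower average state cost.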


Reviews: Data-Efficient Reinforcement Learning in Continuous State-Action Gaussian-POMDPs

Neural Information Processing Systems

This paper describes an extension to the PILCO algorithm (Probabilistic Inference for Learning Control, a data-efficient reinforcement learning algorithm). The proposed algorithm applies a measurement filter during the actual experiment and explicitly accounts for this filter during the policy learning step, which uses data from the experiment. This is an important practical extension, addressing the fact that measurements are often very noisy. My intuitive explanation for this approach is that it makes the overall feedback system more "repeatable" (noise is mostly filtered out), and therefore learning is faster (provided the filtering is effective; see the last sentence of the conclusion). The paper presents detailed mathematical derivations and strong simulation results that highlight the properties of the proposed algorithm.


Data-Efficient Reinforcement Learning in Continuous State-Action Gaussian-POMDPs

McAllister, Rowan, Rasmussen, Carl Edward

Neural Information Processing Systems
